Identifying Symptoms and Diseases in MedNLP Japanese Materials Using Chinese Resources

Authors

  • Lun-Wei Ku
  • Edward T.-H. Chu
  • Cheng-Wei Sun
  • Wan-Lun Li
Abstract

In this paper, we describe the Sinica-Yuntech system (TeamID: SinicaNLP) at the NTCIR-10 MedNLP task. The materials of the MedNLP task are in Japanese. However, having only Chinese resources and knowledge, we needed to translate these materials into Chinese. Two preprocessing approaches, differing in the timing of translation, were taken. One was to translate the Japanese sentences into Chinese and then perform segmentation and part-of-speech tagging on the Chinese sentences; the other was to segment and tag parts of speech on the Japanese sentences and then translate the resulting words. After obtaining the words and their parts of speech, we identified symptoms and diseases by a vocabulary matching approach. Internet search results and part-of-speech patterns were also used to recognize out-of-vocabulary symptoms. After recognizing the targets in Chinese, a reverse translation was performed to label the original Japanese materials. We merged the tags from vocabulary matching, Internet searching, and pattern mapping to obtain our best run: an F-score of 53.88 and an accuracy of 91.46.
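
The vocabulary matching and tag merging steps described in the abstract might be sketched roughly as follows. This is only an illustrative sketch, not the authors' system: the toy vocabulary, the example tokens, the stand-in tags for the out-of-vocabulary recognizers, and the choice to merge tags by union are all assumptions made for the example, and the function names are hypothetical.

```python
from typing import List, Set

# Hypothetical toy vocabulary of Chinese symptom/disease terms (not from the paper).
VOCAB: Set[str] = {"頭痛", "發燒", "糖尿病"}


def match_vocabulary(tokens: List[str], vocab: Set[str]) -> List[bool]:
    """Flag each segmented token that appears in the vocabulary."""
    return [tok in vocab for tok in tokens]


def merge_tags(*tag_lists: List[bool]) -> List[bool]:
    """Merge per-token tags from several recognizers.

    Assumption: a token is labelled if any recognizer flags it (union).
    The abstract only says the tags were merged, not how.
    """
    return [any(flags) for flags in zip(*tag_lists)]


# Assumed output of segmentation on an already translated sentence.
tokens = ["患者", "出現", "頭痛", "和", "咳嗽"]

vocab_tags = match_vocabulary(tokens, VOCAB)

# Stand-in for tags from the out-of-vocabulary recognizers
# (Internet search results and part-of-speech patterns); values are made up.
oov_tags = [False, False, False, False, True]

merged = merge_tags(vocab_tags, oov_tags)
labelled = [tok for tok, flag in zip(tokens, merged) if flag]
print(labelled)  # ['頭痛', '咳嗽']
```

In a full pipeline, the labelled Chinese tokens would then be mapped back to their positions in the original Japanese text, which this sketch leaves out.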


Similar resources

Clinical Entity Recognition Using Cost-Sensitive Structured Perceptron for NTCIR-10 MedNLP

This paper reports on our approach to the NTCIR-10 MedNLP task, which aims at identifying personal and medical information in Japanese clinical texts. We applied a machine learning (ML) algorithm for sequential labeling, specifically, structured perceptron, and defined a cost function for lowering misclassification cost. On the test set provided by the organizers, our approach achieved an F-sco...


Evaluation of Genetic Diversity in Japanese and English White Quail Populations Using Microsatellite Markers

The Japanese and English White quails are widespread strains and belong to the Galliformes order, Phasianidae family, Coturnix genus, and japonica species. These birds are likely to be well adapted to harsh conditions and resistant to diseases, as they have attained economic importance as an agricultural species. In the current study, the genetic variation of Japanese and English White quail ...


Endogenous Gases or Wind as Important Etiology of Diseases in Persian Medicine

 Background and purpose: Sometimes, some symptoms do not respond to usual treatments, or are not justified by classical medicine. In such cases, Persian Medicine can be helpful to better understand and treat the diseases. Endogenous gases (wind or Rih) are among the causes that should be investigated. The purpose of this study was to introduce endogenous gases and etiology of their production i...


NTCIR-10 MedNLP Task Baseline System

Natural language processing (NLP) technology that handles clinical, medical, and health records has drawn much attention, because such records could potentially be rich clinical resources. This paper describes an NLP system that extracts two kinds of information from clinical documents in Japanese, which was developed as a baseline system in the NTCIR-10 MedNLP Pilot Task. Since ou...


Microsatellite mapping of quantitative trait loci affecting carcass traits on chromosome 1 in half-sib families of Japanese quail (Coturnix japonica)

The objective of this study was to identify the quantitative trait loci (QTL) affecting carcass traits on chromosome 1 in Japanese quail. The population comprised 422 progeny in 9 half-sib families. Phenotypic data on carcass weight, carcass parts, and the internal organs were collected on the 422 progeny. The nine half-sib families were genotyped for 8 microsatellite markers covering chromosomes 1...




Publication date: 2013